In [1]:
import time
from IPython import display
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
num_points = 100
X = np.arange(num_points)
Y = (X * 0.8 + 10) + np.random.normal(0, 5, num_points)
# Plot the points
plt.plot(X, Y, 'rx')
plt.show()
Now let's start building our linear regression model. We'll create a TensorFlow graph that defines a simple single-variable linear regression model using squared loss and gradient descent optimization.
Step 1: Create placeholders for the input variable x and the output variable y. Note that the shape has been mentioned explicitly; this is not necessary, however.
In [2]:
# First create placeholders for the input and output
# These placeholders will be later supplied with the data we created at execution time
x = tf.placeholder(tf.float32, shape=(num_points))
y = tf.placeholder(tf.float32, shape=(num_points))
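As an aside (not part of the original notebook, just an illustrative sketch): since the shape argument is optional, placeholders can also be declared without it, in which case TensorFlow will accept a float array of any length at feed time. The x_any/y_any names below are hypothetical.
# Illustrative only: placeholders with no explicit shape
# (shape defaults to None, so a 1-D float array of any length can be fed in)
x_any = tf.placeholder(tf.float32)
y_any = tf.placeholder(tf.float32)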
Step 2: Now create TensorFlow Variables for the weight and bias that have to be learned. Variables allow their values to be changed at execution time. Since our parameters are to be "learned" and are updated after every epoch, we need to use Variables for the weight and bias terms.
Variables MUST be supplied with an initial value, which can be a tensor or a Python object convertible to a tensor. They can also optionally take a data type and a name (commonly used), among other parameters.
In [3]:
# Create variables to hold the weight and bias
W = tf.Variable(0.1, dtype=tf.float32, name="weight")
b = tf.Variable(0.1, dtype=tf.float32, name="bias")
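Because the initial value can itself be a tensor, the parameters could equally well be initialized from an op such as tf.zeros (an illustrative variant only; the _alt names are hypothetical):
# Illustrative only: initialize the parameters from tensors instead of Python floats
W_alt = tf.Variable(tf.zeros([]), name="weight_alt")   # scalar float32 tensor as initial value
b_alt = tf.Variable(tf.zeros([]), name="bias_alt")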
In [4]:
# The linear model to compute Y
y_predicted = tf.add(tf.multiply(W, x), b)
# Compute all the deltas between the computed y and the actual Y and square the errors
squared_deltas = tf.square(tf.subtract(y_predicted, y))
# Sum the squared deltas over all the examples and divide by twice the number of examples
loss = tf.divide(tf.reduce_sum(squared_deltas), 2*num_points)
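For reference, this loss is the halved mean squared error: loss = (1 / (2 * num_points)) * sum_i (W * X[i] + b - Y[i])^2. Because arithmetic operators are overloaded on tensors, an equivalent way to build the same graph is shown below (illustrative only; loss_alt is a hypothetical name):
# Illustrative only: the same loss written with overloaded operators and tf.reduce_mean
loss_alt = tf.reduce_mean(tf.square(W * x + b - y)) / 2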
In [5]:
# Create a gradient descent optimizer with the set learning rate
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0005)
# Run the optimizer to minimize loss
# Tensorflow automatically computes the gradients for the loss function!!!
train = optimizer.minimize(loss)
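Under the hood, minimize() is essentially a shorthand for two steps that can also be called explicitly, which is handy if you want to inspect or clip the gradients (illustrative sketch only; train_alt is a hypothetical name):
# Illustrative only: roughly what minimize(loss) expands to
grads_and_vars = optimizer.compute_gradients(loss)      # symbolic gradients w.r.t. W and b
train_alt = optimizer.apply_gradients(grads_and_vars)   # op that applies the gradient updates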
While constants are initialized automatically, Variables are not. To initialize them, we must explicitly call the global_variables_initializer() function.
In [6]:
# Initialize all variables
init = tf.global_variables_initializer()
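If only a subset of the variables needs initializing, tf.variables_initializer() can be given an explicit list instead (illustrative alternative only; this notebook uses the global initializer):
# Illustrative only: initialize just the listed variables
init_wb = tf.variables_initializer([W, b])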
Step 5: Let's run the graph. First let's create a session and initialize the variables. After that, we will run gradient descent over the entire training set for the specified number of epochs. We will capture the variables of interest and print them out to monitor the optimization process.
Note: Learning rates above 0.0005 don't let the optimization converge; the weights/biases bounce between positive and negative values and grow towards infinity.
In [7]:
# Run the graph
with tf.Session() as sess:
    # Initialize all variables
    sess.run(init)
    # For each epoch
    for epoch in range(1000):
        # Run the optimizer and get the loss
        curr_W, curr_b, curr_loss, _ = sess.run([W, b, loss, train], feed_dict={x: X, y: Y})
        print("W: %.4f" % curr_W, "b: %.4f" % curr_b, "Loss: %.4f" % curr_loss)
        # Plot the points
        plt.plot(X, Y, 'rx', label="All Data")
        # Plot the computed points as a line
        plt.plot(X, curr_W * X + curr_b, label="Line of fit")
        plt.legend()
        plt.show()
        # This must be used if the output/plot is being printed within the loop
        display.clear_output(wait=True)
        time.sleep(0.1)
And there you have it! The line fits! Playing with the hyperparameters can get better fits!
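As a quick sanity check (not part of the original notebook), the learned parameters can be compared against NumPy's closed-form least-squares fit, which should be close to the TensorFlow result once the optimizer has converged:
# Illustrative only: compare the learned parameters against NumPy's least-squares fit
slope, intercept = np.polyfit(X, Y, 1)
print("numpy fit -> W: %.4f, b: %.4f" % (slope, intercept))
print("TF fit    -> W: %.4f, b: %.4f" % (curr_W, curr_b))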